Patent abstract:
Apparatus and method for the acquisition of spatially selective sound by acoustic triangulation. An apparatus for capturing audio information from a destination location is provided. The apparatus comprises a first beam generator (110) arranged in a recording environment and having a first recording characteristic, a second beam generator (120) arranged in the recording environment and having a second recording characteristic, and a signal generator (130). The first beam generator (110) is configured to record a first beam generator audio signal and the second beam generator (120) is configured to record a second beam generator audio signal when the first beam generator (110) and the second beam generator (120) are directed to the destination location with respect to the first and second recording characteristics.
Publication number: BR112013013673B1
Application number: R112013013673-1
Filing date: 2011-12-02
Publication date: 2021-03-30
Inventors: Herre Jürgen; Küch Fabian; Kallinger Markus; Grill Bernhard; Giovanni Del Galdo
Applicant: Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung E.V.
IPC main class:
Patent description:

Description
The invention relates to audio processing and, in particular, to an apparatus for capturing audio information from a destination location. In addition, the invention relates to the acquisition of spatially selective sound by acoustic triangulation.
The acquisition of spatial sound aims to capture either the entire sound field that is present in a recording room or just certain desired components of the sound field that are of interest to the application at hand. As an example, in a situation where several people in a room are talking, it may be of interest to capture the entire sound field (including its spatial characteristics) or just the signal that a certain speaker produces. The latter makes it possible to isolate the sound and apply specific processing to it, such as amplification, filtering, etc.
There are several known methods for spatially selective capture of certain sound components. These methods generally employ highly directive microphones or microphone arrays. Most methods have in common that the microphone or the microphone array is arranged in a known, fixed geometry. The spacing between the microphones is as small as possible for coincident microphone techniques, while it is usually a few centimeters for the other methods. In the following, any device for the directionally selective acquisition of spatial sound (for example, directional microphones, microphone arrays, etc.) is referred to as a beam generator. Directional selectivity in sound capture, that is, spatially selective sound acquisition, can be obtained in several ways:
One possible way is to employ directional microphones (for example, cardioid, super-cardioid, or "shotgun" microphones). These microphones capture sound differently depending on the direction of arrival (DOA | direction-of-arrival) relative to the microphone. In some microphones, this effect is small, as they capture sound almost independently of the direction. These microphones are called omnidirectional microphones. Typically, in these microphones, a circular diaphragm is attached to a small, airtight enclosure, see, for example, [Ea01] Eargle, J.: "The Microphone Book", Focal Press, 2001. If the diaphragm is not attached to the enclosure and the sound reaches it equally from each side, its directional pattern has two lobes of equal magnitude. It captures sound at an equal level from the front and the rear of the diaphragm, however, with reversed polarities. Such a microphone does not capture sound from directions parallel to the diaphragm plane. This directional pattern is called a dipole or figure-of-eight. If the enclosure of an omnidirectional microphone is not airtight, but is specially constructed so that sound waves can propagate through the enclosure and reach the diaphragm, the directional pattern lies somewhere between omnidirectional and dipole (see [Ea01]). The patterns can have two lobes, and the lobes can have different magnitudes. The patterns can also have a single lobe; the most important example is the cardioid pattern, where the directional function D can be expressed as D = 0.5 (1 + cos(θ)), where θ is the direction of arrival of the sound (see [Ea01]). This function quantifies the relative magnitude of the sound level captured from a plane wave arriving at the angle θ with respect to the direction of greatest sensitivity. Omnidirectional microphones are called zero-order microphones, and the other patterns mentioned earlier, such as the dipole and cardioid patterns, are known as first-order patterns.
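The cardioid directivity function above can be evaluated directly. The following sketch (the function name is illustrative) checks the characteristic values of D = 0.5 (1 + cos(θ)): full sensitivity on-axis, a null at the rear, and half sensitivity at 90 degrees.

```python
import numpy as np

# First-order cardioid directivity function D(theta) = 0.5 * (1 + cos(theta)).
def cardioid_directivity(theta_rad):
    """Relative sensitivity for sound arriving at angle theta (radians)."""
    return 0.5 * (1.0 + np.cos(theta_rad))

print(cardioid_directivity(0.0))        # 1.0 (maximum sensitivity on-axis)
print(cardioid_directivity(np.pi))      # 0.0 (null at the rear)
print(cardioid_directivity(np.pi / 2))  # 0.5 (half sensitivity at 90 degrees)
```
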
These types of microphones do not allow the pattern to be formed arbitrarily, as their directivity pattern is almost completely determined by their mechanical construction.
Special acoustic structures also exist and can be used to create directional patterns narrower than those of first-order microphones. For example, if a tube that has holes in it is attached to an omnidirectional microphone, a microphone with a very narrow directional pattern can be created. Such microphones are called shotgun or rifle microphones (see [Ea01]). They typically do not have flat frequency responses, and their directivity cannot be controlled after the recording.
Another method for building a microphone with directional characteristics is to record the sound with an array of omnidirectional or directional microphones and apply signal processing afterwards, see, for example, [BW01] M. Brandstein, D. Ward: "Microphone Arrays - Signal Processing Techniques and Applications", Springer Berlin, 2001, ISBN: 978-3-540-41953-2. There is a variety of methods for this. In the simplest form, when the sound is recorded with two omnidirectional microphones placed close to each other and the signals are subtracted from each other, a virtual microphone signal with a dipole characteristic is formed. See, for example, [Elk00] G. W. Elko: "Superdirectional microphone arrays" in S. G. Gay, J. Benesty (eds.): "Acoustic Signal Processing for Telecommunication", Chapter 10, Kluwer Academic Press, 2000, ISBN: 978-0792378143. The microphone signals can also be delayed or filtered before being added together. In beam generation, a signal corresponding to a narrow beam is formed by filtering each microphone signal with a specially designed filter and then adding the results together. This "filter-and-sum beam generation" is explained in [BS01] J. Bitzer, K. U. Simmer: "Superdirective microphone arrays" in M. Brandstein, D. Ward (eds.): "Microphone Arrays - Signal Processing Techniques and Applications", Chapter 2, Springer Berlin, 2001, ISBN: 978-3-540-41953-2.
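The dipole-by-subtraction idea above can be illustrated numerically. In this sketch, the spacing, frequency, and speed of sound are assumed values; for a unit-amplitude plane wave, the magnitude of the difference of the two omnidirectional microphone signals exhibits the figure-of-eight null perpendicular to the microphone axis.

```python
import numpy as np

# Sketch (assumed parameters): two closely spaced omnidirectional microphones;
# subtracting their signals yields a virtual dipole. For a plane wave of
# frequency f arriving at angle theta, the inter-microphone delay is
# d * cos(theta) / c.
c = 343.0   # speed of sound in m/s
d = 0.01    # microphone spacing in m (small compared to the wavelength)
f = 500.0   # plane-wave frequency in Hz

def dipole_response(theta):
    """Magnitude of (mic1 - mic2) for a unit plane wave from angle theta."""
    tau = d * np.cos(theta) / c                  # arrival-time difference
    return abs(1.0 - np.exp(-2j * np.pi * f * tau))

endfire = dipole_response(0.0)           # along the microphone axis
broadside = dipole_response(np.pi / 2)   # perpendicular to the axis
print(endfire > 100 * broadside)         # True: figure-of-eight null broadside
```

The response is maximal endfire and symmetric front/back, i.e., the two equal lobes with reversed polarity described above.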
These techniques are blind to the signal itself; for example, they are not aware of the direction of arrival of the sound. Estimating the direction of arrival (DOA) is a task of its own, see, for example, [CBH06] J. Chen, J. Benesty, Y. Huang: "Time Delay Estimation in Room Acoustic Environments: An Overview", EURASIP Journal on Applied Signal Processing, Article ID 26503, Volume 2006 (2006). In principle, many different directional characteristics can be formed with these techniques. To form very spatially selective, arbitrary sensitivity patterns, however, a large number of microphones is required. In general, all of these techniques depend on the distances between adjacent microphones being small compared to the wavelength of interest.

Another way to achieve directional selectivity in sound capture is parametric spatial filtering. Standard beam generator designs, which can, for example, be based on a limited number of microphones and which have time-invariant filters in their filter-and-sum structure (see [BS01]), generally exhibit only limited spatial selectivity. Recently, to increase spatial selectivity, parametric spatial filtering techniques have been proposed that apply (time-variant) spectral gain functions to the input signal spectrum. The gain functions are designed based on parameters which are related to the human perception of spatial sound. A spatial filtering approach is presented in [DIFF2009] M. Kallinger, G. Del Galdo, F. Küch, D. Mahne, and R. Schultz-Amling, "Spatial Filtering using Directional Audio Coding Parameters," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Apr. 2009, and is implemented in the domain of the Directional Audio Coding (DirAC) parameters, an efficient spatial coding technique.
Directional Audio Coding is described in [Pul06] Pulkki, V., "Directional audio coding in spatial sound reproduction and stereo upmixing," in Proceedings of the AES 28th International Conference, pp. 251-258, Piteå, Sweden, June 30 - July 2, 2006. In DirAC, the sound field is analyzed at one location, where the active intensity vector as well as the sound pressure is measured. These physical quantities are used to extract the three DirAC parameters: sound pressure, direction of arrival (DOA) and sound diffuseness. DirAC makes use of the assumption that the human auditory system can only process one direction per time-frequency tile. This assumption is also exploited by other spatial audio coding techniques such as MPEG Surround, see, for example, [Vil06] L. Villemoes, J. Herre, J. Breebaart, G. Hotho, S. Disch, H. Purnhagen, and K. Kjörling, "MPEG Surround: The Forthcoming ISO Standard for Spatial Audio Coding," in AES 28th International Conference, Piteå, Sweden, June 2006. The spatial filtering approach, as described in [DIFF2009], allows an almost free choice of spatial selectivity. Another technique makes use of comparable spatial parameters. This technique is explained in [Fal08] C. Faller: "Obtaining a Highly Directive Center Channel from Coincident Stereo Microphone Signals", Proc. 124th AES Convention, Amsterdam, The Netherlands, 2008, Preprint 7380. In contrast to the technique described in [DIFF2009], in which a spectral gain function is applied to an omnidirectional microphone signal, the approach in [Fal08] makes use of two cardioid microphones.
The two parametric spatial filtering techniques mentioned depend on the microphone spacing being small compared to the wavelength of interest. Ideally, the techniques described in [DIFF2009] and [Fal08] rely on coincident directional microphones.
Another way of achieving directional selectivity in sound capture is to filter the microphone signals based on the coherence between them. In [SBM01] K. U. Simmer, J. Bitzer, and C. Marro: "Post-Filtering Techniques" in M. Brandstein, D. Ward (eds.): "Microphone Arrays - Signal Processing Techniques and Applications", Chapter 3, Springer Berlin, 2001, ISBN: 978-3-540-41953-2, a family of systems is described which employs at least two (not necessarily directional) microphones and whose output signal processing is based on the coherence of the signals. The underlying assumption is that diffuse background noise appears as incoherent parts in the two microphone signals, while a source signal appears coherently in both. Based on this premise, the coherent part is extracted as the source signal. The techniques mentioned in [SBM01] were developed because filter-and-sum beam generators with a limited number of microphones are hardly capable of reducing diffuse noise signals. No assumptions about the location of the microphones are made; not even the spacing of the microphones needs to be known.

A major limitation of traditional approaches to spatially selective sound acquisition is that the recorded sound is always related to the location of the beam generator. In many applications, however, it is not possible (or practicable) to place a beam generator in the desired position, for example, at a desired angle with respect to the sound source of interest.
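The coherence-based post-filtering idea of [SBM01] described above can be sketched as follows. This is a hedged illustration, not the method of [SBM01] itself: two simulated microphone spectra share a coherent source while their diffuse noise parts are incoherent, and the magnitude-squared coherence per frequency bin is estimated by averaging over frames; all signal sizes and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
frames, n_bins = 200, 8

# A common source appears coherently in both microphones; the diffuse
# noise parts are independent (incoherent) between the microphones.
source = rng.standard_normal((frames, n_bins)) + 1j * rng.standard_normal((frames, n_bins))
noise1 = 0.1 * (rng.standard_normal((frames, n_bins)) + 1j * rng.standard_normal((frames, n_bins)))
noise2 = 0.1 * (rng.standard_normal((frames, n_bins)) + 1j * rng.standard_normal((frames, n_bins)))
X1 = source + noise1
X2 = source + noise2

# Magnitude-squared coherence per frequency bin, averaged over frames:
C12 = np.mean(X1 * np.conj(X2), axis=0)
P1 = np.mean(np.abs(X1) ** 2, axis=0)
P2 = np.mean(np.abs(X2) ** 2, axis=0)
msc = np.abs(C12) ** 2 / (P1 * P2)

# The coherent source dominates, so the coherence (usable as a gain) is high:
print(np.all(msc > 0.9))
```

In a coherence-based post-filter, values like `msc` (or gains derived from them) would attenuate time-frequency regions dominated by incoherent diffuse noise.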
Traditional beam generators can, for example, employ microphone arrays and can form a directional pattern ("beam") to capture the sound from one direction and reject the sound from other directions. Consequently, there is no possibility to restrict the sound capture region with respect to its distance from the capturing microphone array.
It would be extremely desirable to have a capture device that can selectively capture sound not only originating from one direction, but restricted to sound originating from one location (point), similar to the way a close-up "local microphone" placed at the desired location would perform.
The purpose of the present invention is to provide improved concepts for capturing audio information from a destination location. The objective of the present invention is achieved by an apparatus for capturing audio information according to claim 1, by a method for capturing audio information according to claim 14, and by a computer program according to claim 15.
An apparatus for capturing audio information from a destination location is provided. The apparatus comprises a first beam generator arranged in a recording environment and having a first recording characteristic, a second beam generator arranged in the recording environment and having a second recording characteristic, and a signal generator. The first beam generator is configured to record a first beam generator audio signal and the second beam generator is configured to record a second beam generator audio signal when the first beam generator and the second beam generator are directed to the destination location with respect to the first and second recording characteristics. The first beam generator and the second beam generator are arranged so that a first virtual straight line, defined to pass through the first beam generator and the destination location, and a second virtual straight line, defined to pass through the second beam generator and the destination location, are not parallel to each other. The signal generator is configured to generate an audio output signal based on the first beam generator audio signal and the second beam generator audio signal so that the audio output signal reflects relatively more audio information from the destination location than the first and second beam generator audio signals do. With respect to a three-dimensional environment, the first virtual straight line and the second virtual straight line preferably intersect and define a plane that can be arbitrarily oriented.
According to the present invention, means for capturing sound in a spatially selective manner are provided; that is, only sound originating from a specific destination location is reproduced, as if a close-up "local microphone" had been installed at this location. Instead of actually installing this local microphone, however, its output signal can be simulated by using two beam generators placed at different, distant positions.
These two beam generators are not positioned close to each other; rather, they are located so that each of them performs an independent directional sound acquisition. Their "beams" overlap at a desired point, and their individual outputs are subsequently combined to form a final output signal. In contrast to other possible approaches, the combination of the two individual outputs does not require any information or knowledge about the position of the two beam generators in a common coordinate system. Thus, the entire configuration for the acquisition of a virtual local microphone signal comprises two beam generators that operate independently, plus a signal processor that combines both individual output signals into the remote "local microphone" signal.
In one application, the apparatus comprises a first and a second beam generator, for example, two spatial microphones, and a signal generator, for example, a combination unit such as a processor, to perform the "acoustic intersection". Each spatial microphone has a clear directional selectivity, that is, it attenuates sound originating from locations outside its beam compared to sound originating from a location within its beam. The spatial microphones operate independently of each other. The location of the two spatial microphones, which is otherwise flexible, is chosen so that the destination location lies at the geometric intersection of the two beams. In a preferred application, the two spatial microphones form an angle of approximately 90 degrees with respect to the destination location. The combination unit, for example, the processor, need not be aware of the geometric location of the two spatial microphones or the location of the destination source.
According to an application, the first beam generator and the second beam generator are arranged with respect to the destination location so that the first virtual straight line and the second virtual straight line intersect at the destination location with an angle of intersection between 30 degrees and 150 degrees. In another application, the angle of intersection is between 60 degrees and 120 degrees. In a preferred application, the angle of intersection is approximately 90 degrees.
In one application, the signal generator comprises an adaptive filter having a plurality of filter coefficients. The adaptive filter is arranged to receive the first beam generator audio signal. The filter is adapted to modify the first beam generator audio signal depending on the filter coefficients to obtain a filtered first beam generator audio signal. The signal generator is configured to adjust the filter coefficients depending on the second beam generator audio signal. The signal generator can be configured to adjust the filter coefficients so that the difference between the filtered first beam generator audio signal and the second beam generator audio signal is reduced.
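One way such an adaptive filter could be realized is a normalized LMS (NLMS) update, sketched below; the filter length and step size are illustrative assumptions, not values from the patent. The coefficients are adapted so that the difference between the filtered first beam generator signal and the second beam generator signal shrinks.

```python
import numpy as np

# Hedged NLMS sketch of the adaptive filter described above.
def nlms(s1, s2, n_taps=8, mu=0.5, eps=1e-8):
    """Adaptively filter s1 so that the output approaches s2."""
    w = np.zeros(n_taps)
    out = np.zeros(len(s1))
    for n in range(n_taps - 1, len(s1)):
        x = s1[n - n_taps + 1 : n + 1][::-1]  # most recent sample first
        out[n] = w @ x                        # filtered first signal
        e = s2[n] - out[n]                    # difference to second signal
        w += mu * e * x / (x @ x + eps)       # NLMS coefficient update
    return out, w

rng = np.random.default_rng(1)
common = rng.standard_normal(4000)            # signal part common to both beams
s1 = common
s2 = np.convolve(common, [0.5, 0.3])[: len(common)]  # second beam: filtered copy

out, w = nlms(s1, s2)
err = np.mean((s2[1000:] - out[1000:]) ** 2)
print(err < 1e-3)   # True once the filter has converged
```

After convergence, the leading coefficients approach the simulated relation between the two signals (here 0.5 and 0.3).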
In one application, the signal generator comprises an intersection calculator for generating the audio output signal in the spectral domain based on the first and second beam generator audio signals. According to one application, the signal generator may further comprise an analysis filter bank to transform the first and second beam generator audio signals from a time domain into a spectral domain, and a synthesis filter bank to transform the audio output signal from the spectral domain into a time domain. The intersection calculator can be configured to calculate the audio output signal in the spectral domain based on the first beam generator audio signal represented in the spectral domain and the second beam generator audio signal represented in the spectral domain.
In another application, the intersection calculator is configured to compute the audio output signal in the spectral domain based on a cross spectral density of the first and the second beam generator audio signal and on a power spectral density of the first or the second beam generator audio signal.
According to an application, the intersection calculator is configured to compute the audio output signal in the spectral domain using the formula

Y1(k, n) = (C12(k, n) / P1(k, n)) · S1(k, n),

where Y1(k, n) is the audio output signal in the spectral domain, where S1(k, n) is the first beam generator audio signal, where C12(k, n) is a cross spectral density of the first and the second beam generator audio signal, and where P1(k, n) is a power spectral density of the first beam generator audio signal, or using the formula

Y2(k, n) = (C12(k, n) / P2(k, n)) · S2(k, n),

where Y2(k, n) is the audio output signal in the spectral domain, where S2(k, n) is the second beam generator audio signal, where C12(k, n) is a cross spectral density of the first and the second beam generator audio signal, and where P2(k, n) is a power spectral density of the second beam generator audio signal.
In another application, the intersection calculator is adapted to calculate both the signal Y1(k, n) and the signal Y2(k, n) and to select the smaller of the two signals as the audio output signal.
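A minimal sketch of this spectral-domain computation is given below: per time-frequency tile, the ratio of the cross spectral density C12 to the power spectral density P1 (or P2) is applied to S1 (or S2), and the smaller of the two candidates is selected. The recursive averaging used to estimate C12, P1, P2, the smoothing constant, and the epsilon regularization are illustrative assumptions; depending on the convention used to define C12, a complex conjugate may be needed for the second candidate.

```python
import numpy as np

# Hedged sketch of the spectral-domain "acoustic intersection".
def acoustic_intersection(S1, S2, alpha=0.9, eps=1e-12):
    """S1, S2: complex STFT arrays of shape (frames, bins)."""
    C12 = np.zeros(S1.shape[1], dtype=complex)
    P1 = np.zeros(S1.shape[1])
    P2 = np.zeros(S2.shape[1])
    Y = np.zeros_like(S1)
    for n in range(S1.shape[0]):
        # Recursive estimates of cross and power spectral densities:
        C12 = alpha * C12 + (1 - alpha) * S1[n] * np.conj(S2[n])
        P1 = alpha * P1 + (1 - alpha) * np.abs(S1[n]) ** 2
        P2 = alpha * P2 + (1 - alpha) * np.abs(S2[n]) ** 2
        Y1 = C12 / (P1 + eps) * S1[n]
        Y2 = C12 / (P2 + eps) * S2[n]
        # Select the smaller of the two candidate outputs per tile:
        Y[n] = np.where(np.abs(Y1) <= np.abs(Y2), Y1, Y2)
    return Y

# Identical beam signals are fully coherent and pass through unchanged:
rng = np.random.default_rng(2)
S = rng.standard_normal((50, 4)) + 1j * rng.standard_normal((50, 4))
print(np.allclose(acoustic_intersection(S, S), S, atol=1e-6))
```

For incoherent inputs, C12 tends toward zero and the output is attenuated, which is the intended suppression of sound outside the beam intersection.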
In another application, the intersection calculator is configured to compute the audio output signal in the spectral domain using the formula

Y3(k, n) = (C12(k, n) / sqrt(P1(k, n) · P2(k, n))) · S1(k, n),

where Y3(k, n) is the audio output signal in the spectral domain, where S1(k, n) is the first beam generator audio signal, where C12(k, n) is a cross spectral density of the first and the second beam generator audio signal, where P1(k, n) is a power spectral density of the first beam generator audio signal, and where P2(k, n) is a power spectral density of the second beam generator audio signal, or using the formula

Y4(k, n) = (C12(k, n) / sqrt(P1(k, n) · P2(k, n))) · S2(k, n),

where Y4(k, n) is the audio output signal in the spectral domain, where S2(k, n) is the second beam generator audio signal, where C12(k, n) is a cross spectral density of the first and the second beam generator audio signal, where P1(k, n) is a power spectral density of the first beam generator audio signal, and where P2(k, n) is a power spectral density of the second beam generator audio signal.
In another application, the intersection calculator can be adapted to calculate both the signal Y3(k, n) and the signal Y4(k, n) and to select the smaller of the two signals as the audio output signal.
According to another application, the signal generator can be adapted to generate the audio output signal by combining the first and second beam generator audio signals to obtain a combined signal and by weighting the combined signal by a gain factor. The combined signal can, for example, be weighted in a time domain, in a subband domain or in a Fast Fourier Transform domain.
In another application, the signal generator is adapted to generate the audio output signal by generating a combined signal so that the value of the power spectral density of the combined signal is equal to the minimum of the values of the power spectral density of the first and the second beam generator audio signal for each time-frequency tile considered.
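This min-power combination rule can be sketched as follows: the combined signal is scaled per time-frequency tile so that its power equals the minimum of the two beam generator powers. Averaging the two spectra as the combination step is an illustrative assumption here, not a rule taken from the patent.

```python
import numpy as np

# Hedged sketch: scale a combined STFT per tile to the minimum input power.
def min_psd_combine(S1, S2, eps=1e-12):
    """S1, S2: complex STFT arrays (frames, bins). Returns combined STFT."""
    combined = 0.5 * (S1 + S2)                     # assumed combination rule
    target_psd = np.minimum(np.abs(S1) ** 2, np.abs(S2) ** 2)
    gain = np.sqrt(target_psd / (np.abs(combined) ** 2 + eps))
    return gain * combined

S1 = np.array([[3.0 + 0j, 1.0 + 0j]])
S2 = np.array([[1.0 + 0j, 4.0 + 0j]])
out = min_psd_combine(S1, S2)
# Per-tile power of the output equals the minimum of the two input powers:
print(np.allclose(np.abs(out) ** 2, [[1.0, 1.0]]))
```

Because a source outside the beam intersection appears strongly in only one of the two signals, taking the per-tile minimum suppresses it in the combined output.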
Preferred applications of the present invention will be explained with respect to the accompanying figures, in which:
Fig. 1 illustrates an apparatus for capturing audio information from a destination location according to an application,
Fig. 2 illustrates an apparatus according to an application using two beam generators and a stage to calculate the output signal,
Fig. 3a illustrates a beam generator and a beam of the beam generator directed to the destination location,
Fig. 3b illustrates a beam generator and a beam of the beam generator in more detail,
Fig. 4a illustrates a geometric configuration of two beam generators with respect to a destination location according to an application,
Fig. 4b depicts the geometric configuration of the two beam generators of Figure 4a and three sound sources,
Fig. 4c illustrates the geometric configuration of the two beam generators of Figure 4b and the three sound sources in a more detailed illustration,
Fig. 5 depicts a signal generator according to an application,
Fig. 6 illustrates a signal generator according to another application, and
Fig. 7 is a flowchart illustrating the generation of an audio output signal based on a cross spectral density and a power spectral density according to an application.

Figure 1 illustrates an apparatus for capturing audio information from a destination location. The apparatus comprises a first beam generator 110 arranged in a recording environment and having a first recording characteristic. In addition, the apparatus comprises a second beam generator 120 arranged in the recording environment and having a second recording characteristic. In addition, the apparatus comprises a signal generator 130. The first beam generator 110 is configured to record a first beam generator audio signal s1 when the first beam generator 110 is directed to the destination location with respect to the first recording characteristic.
The second beam generator 120 is configured to record a second beam generator audio signal s2 when the second beam generator 120 is directed to the destination location with respect to the second recording characteristic. The first beam generator 110 and the second beam generator 120 are arranged so that a first virtual straight line, defined to pass through the first beam generator 110 and the destination location, and a second virtual straight line, defined to pass through the second beam generator 120 and the destination location, are not parallel to each other. Signal generator 130 is configured to generate an audio output signal s based on the first beam generator audio signal s1 and the second beam generator audio signal s2, so that the audio output signal s reflects relatively more audio information from the destination location than the first and second beam generator audio signals s1, s2 do.

Figure 2 illustrates an apparatus according to an application using two beam generators and a stage that calculates the output signal as the common part of the two individual beam generator output signals. A first beam generator 210 and a second beam generator 220 for recording a first and a second beam generator audio signal, respectively, are depicted. A signal generator 230 computes the common signal part (an "acoustic intersection").

Figure 3a illustrates a beam generator 310. The beam generator 310 of the application of Figure 3a is an apparatus for the directionally selective acquisition of spatial sound. For example, beam generator 310 can be a directional microphone or a microphone array. In another application, the beam generator may comprise a plurality of directional microphones. Figure 3a illustrates a curved line 316 surrounding a beam 315.
All points on the curved line 316 defining beam 315 are characterized in that sound with a predefined sound pressure level originating from any point on the curved line results in the same output signal level of the beam generator.
In addition, Figure 3a illustrates a major axis 320 of the beam generator. The major axis 320 of the beam generator 310 is defined in that sound with a predefined sound pressure level originating from a considered point on the major axis 320 results in a first beam generator output signal level that is greater than or equal to a second beam generator output signal level resulting from sound with the predefined sound pressure level originating from any other point having the same distance from the beam generator as the considered point.

Figure 3b illustrates this in more detail. Points 325, 326 and 327 have equal distance d from beam generator 310. A sound with a predefined sound pressure level originating from point 325 on the major axis 320 results in a first beam generator output signal level that is greater than or equal to a second beam generator output signal level resulting from a sound with the predefined sound pressure level originating, for example, from point 326 or point 327, which have the same distance d from the beam generator 310 as point 325 on the major axis. In the three-dimensional case, this means that the major axis indicates the point on a virtual sphere, with the beam generator located at the center of the sphere, that generates the highest beam generator output signal level when sound with a predefined sound pressure level originates from it, compared to any other point on the virtual sphere.
Returning to Figure 3a, a destination location 330 is further depicted. Destination location 330 can be a location from which sound originates that a user intends to record using beam generator 310. To do this, the beam generator can be directed to the destination location to record the desired sound. In this context, a beam generator 310 is considered to be directed to a destination location 330 when the major axis 320 of the beam generator 310 passes through the destination location 330. Sometimes the destination location 330 can be a destination area, while in other examples the destination location can be a point. If the destination location 330 is a point, the major axis 320 is considered to pass through the destination location 330 when the point lies on the major axis 320. In Figure 3a, the major axis 320 of the beam generator 310 passes through destination location 330, and thus beam generator 310 is directed to the destination location.
The beam generator 310 has a recording characteristic that indicates the ability of the beam generator to record sound depending on the direction from which the sound originates. The recording characteristic of the beam generator 310 comprises the direction of the major axis in space and the direction, shape, and properties of the beam 315, etc.

Figure 4a illustrates a geometric configuration of two beam generators, a first beam generator 410 and a second beam generator 420, with respect to a destination location 430. A first beam 415 of the first beam generator 410 and a second beam 425 of the second beam generator 420 are illustrated. In addition, Figure 4a depicts a first major axis 418 of the first beam generator 410 and a second major axis 428 of the second beam generator 420. The first beam generator 410 is arranged so that it is directed to the destination location 430, as the first major axis 418 passes through the destination location 430. In addition, the second beam generator 420 is also directed to the destination location 430, as the second major axis 428 passes through the destination location 430.
The first beam 415 of the first beam generator 410 and the second beam 425 of the second beam generator 420 intersect at the destination location 430, where a destination source that emits sound is located. An angle of intersection of the first major axis 418 of the first beam generator 410 and the second major axis 428 of the second beam generator 420 is denoted α. Optimally, the angle of intersection α is 90 degrees. In other applications, the angle of intersection is between 30 degrees and 150 degrees.
In a three-dimensional environment, preferably, the first major axis and the second major axis intersect and define a plane that can be arbitrarily oriented.
Figure 4b depicts the geometric configuration of the two beam generators of Figure 4a, further illustrating three sound sources src1, src2, src3. Beams 415, 425 of beam generators 410 and 420 intersect at the destination location, that is, the location of the destination source src3. Sources src1 and src2, however, are each located in only one of the two beams 415, 425. It should be noted that both the first and the second beam generator 410, 420 are adapted for directionally selective sound acquisition, and their beams 415, 425 indicate the sound that is acquired by each of them, respectively. Thus, the first beam 415 of the first beam generator 410 indicates a first recording characteristic of the first beam generator 410. The second beam 425 of the second beam generator 420 indicates a second recording characteristic of the second beam generator 420.
In the application of Figure 4b, sources src1 and src2 represent unwanted sources that interfere with the signal of the desired source src3. However, sources src1 and src2 can also be considered as independent ambience components collected by the two beam generators. Ideally, the output of an apparatus according to an application would return only src3 while completely suppressing the unwanted sources src1 and src2.
According to the application of Figure 4b, two or even more devices for directionally selective sound acquisition, for example, directional microphones, microphone arrays and corresponding beam generators, are employed to achieve a "remote local microphone" functionality. Suitable beam generators can, for example, be microphone arrays or highly directional microphones, such as shotgun microphones, and their output signals can be used as the beam generator audio signals. The "remote local microphone" functionality is used to collect only sound originating from a restricted area around the destination point. Figure 4c illustrates this in more detail. According to one application, the first beam generator 410 captures the sound from a first direction. The second beam generator 420, which is located far away from the first beam generator 410, captures the sound from a second direction.
The first and second beam generators 410, 420 are arranged so that they are directed to the target location 430. In preferred applications, the beam generators 410, 420, for example, two microphone networks, are distant from each other and point at the target location from different directions. This is different from traditional microphone network processing, where only a single network is used and its different sensors are placed in close proximity to each other. The first main axis 418 of the first beam generator 410 and the second main axis 428 of the second beam generator 420 form two straight lines which are not arranged in parallel, but which intersect at an angle of intersection a. The second beam generator 420 would be optimally positioned with respect to the first beam generator when the angle of intersection is 90 degrees. In applications, the angle of intersection is at least 60 degrees.
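As an illustration (not part of the original disclosure), the placement condition above can be checked numerically. The following Python sketch, with a hypothetical helper name, computes the intersection angle formed at the target location by the lines of sight of two beam generators:

```python
import numpy as np

def intersection_angle_deg(gen1, gen2, target):
    """Angle (in degrees) formed at the target location by the two lines of sight."""
    v1 = np.asarray(target, float) - np.asarray(gen1, float)
    v2 = np.asarray(target, float) - np.asarray(gen2, float)
    cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

# Two beam generators aimed at the same target from orthogonal directions.
angle = intersection_angle_deg(gen1=(0.0, 0.0), gen2=(2.0, 2.0), target=(2.0, 0.0))
# angle == 90.0, i.e. the optimal configuration described above
```

Such a check could be used to verify that a given arrangement satisfies the at-least-60-degrees condition mentioned in the text.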
The destination point or destination area for sound capture is the intersection of both beams 415, 425. The signal in this area is derived by processing the output signals of the two beam generators 410, 420, so that an "acoustic intersection" is calculated. This intersection can be considered as the part of the signal that is common/coherent between the two individual beam generator output signals.
This concept exploits both the individual directivity of the beam generators and the coherence between the beam generator output signals. This is different from common microphone network processing, where only one network is used and its different sensors are placed close together.
In the present approach, the emitted sound is captured/acquired from a specific destination location. This is in contrast to approaches that use distributed microphones to estimate the position of sound sources, but which do not aim at an improved recording of the localized sound sources by considering the output of distant microphone networks as proposed according to applications.
In addition to using highly directional microphones, the concepts according to the applications can be implemented with both classic beam generators and parametric spatial filters. If the beam generator introduces frequency-dependent amplitude and phase distortions, these would be known and considered in the computation of the "acoustic intersection".
In an application, a device, for example a signal generator, computes a component of the "acoustic intersection". An ideal device for computing the intersection would deliver the full output if a signal is present in both audio signals from the beam generators (for example, the audio signals recorded by the first and the second beam generator) and would deliver zero output if a signal is present in only one of the two audio signals from the beam generators. Good suppression characteristics, which also guarantee a good performance of the device, can, for example, be achieved by determining the transfer gain of a signal present in only one audio signal from the beam generators and adjusting it with respect to the transfer gain for a signal present in both audio signals from the beam generators.
The two audio signals from the beam generators, s1 and s2, can be considered as a superposition of a filtered, delayed and/or scaled common target signal s and individual noise/interference signals n1 and n2, so that s1 = f1(s) + n1 and s2 = f2(s) + n2, where f1(x) and f2(x) are the individual filtering, delay and/or scaling functions applied to the two signals. Thus, the task is to estimate s from s1 = f1(s) + n1 and s2 = f2(s) + n2. To avoid ambiguities, f2(x) can be defined as the identity without loss of generality.
The computation of the "intersection component" can be implemented in different ways.
According to an application, the common part of the two signals is calculated using adaptive filters, for example, classic adaptive LMS (Least Mean Squares) filters, as they are common in acoustic echo cancellation. Figure 5 illustrates a signal generator according to an application, in which a common signal s is calculated from the signals s1 and s2 using an adaptive filter 510. The signal generator of Figure 5 receives the first audio signal from the beam generator s1 and the second audio signal from the beam generator s2 and generates the audio output signal based on the first and second audio signals from the beam generator, s1 and s2.
The signal generator of Figure 5 comprises an adaptive filter 510. A classical minimum mean square error adaptation/optimization processing scheme, as known from acoustic echo cancellation, is performed by the adaptive filter 510. The adaptive filter 510 receives the first audio signal from the beam generator s1 and filters it to generate a filtered first audio signal s as the audio output signal. (Another notation could be used for this signal; however, for better readability, the time-domain audio output signal will be referred to as "s" below.) The filtering of the first audio signal from the beam generator s1 is conducted based on the adjustable coefficients of the adaptive filter 510.
The signal generator of Figure 5 emits the filtered first audio signal s. In addition, the filtered audio output signal s is also fed into a difference calculator 520. The difference calculator 520 further receives the second audio signal from the beam generator and calculates the difference between the filtered first audio signal s and the second audio signal from the beam generator s2.
The signal generator is adapted to adjust the filter coefficients of the adaptive filter 510 so that the difference between the filtered version of s1 (= s) and s2 is reduced. Thus, the signal s, that is, the filtered version of s1, can be considered as representative of the desired coherent output signal.
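As an illustration of the adaptive scheme of Figure 5 (not part of the original disclosure), the following sketch runs a normalized LMS (NLMS) filter on two synthetic signals sharing a common component; all signal models, filter length and step-size values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8000
s = rng.standard_normal(n)                     # common target-source signal
s1 = s + 0.1 * rng.standard_normal(n)          # first beam generator: s plus noise n1
h = np.array([0.0, 0.8, 0.3])                  # assumed path between the two pickups
s2 = np.convolve(s, h)[:n] + 0.1 * rng.standard_normal(n)   # second signal: f2(s) + n2

# NLMS adaptation: filter s1 so that the result matches s2 (Fig. 5 scheme).
L, mu, eps = 8, 0.5, 1e-8
w = np.zeros(L)                                # adjustable filter coefficients
out = np.zeros(n)                              # filtered version of s1 -> coherent part
for t in range(L, n):
    x = s1[t - L + 1:t + 1][::-1]              # most recent L samples, newest first
    y = w @ x                                  # filter output
    e = s2[t] - y                              # difference computed by block 520
    w += mu * e * x / (x @ x + eps)            # coefficient update reducing |e|
    out[t] = y

# After convergence, the residual is dominated by the incoherent noise parts.
nmse = np.mean((s2[n // 2:] - out[n // 2:]) ** 2) / np.mean(s2[n // 2:] ** 2)
```

After adaptation, `out` approximates the coherent part common to both signals, and the normalized residual `nmse` remains close to the noise floor.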
In another application, the common part of the two signals is extracted based on a coherence metric between the two signals; see, for example, the coherence metrics described in [Fa03] C. Faller and F. Baumgarte, "Binaural Cue Coding - Part II: Schemes and applications," IEEE Trans. on Speech and Audio Proc., Vol. 11, no. 6, Nov. 2003. See also the coherence metrics described in [Fa06] and [Her08].
A coherent part of the two signals can be extracted from the signals represented in a time domain, but also, and preferably, from the signals represented in a spectral domain, for example, a time-frequency domain. Figure 6 illustrates a signal generator according to an application. The signal generator comprises an analysis filter bank 610. The analysis filter bank 610 receives a first audio signal from the beam generator s1(t) and a second audio signal from the beam generator s2(t). The first and second audio signals from the beam generator, s1(t) and s2(t), are represented in a time domain; t specifies the time sample number of the respective audio signal. The analysis filter bank 610 is adapted to transform the first and second audio signals from the beam generator, s1(t) and s2(t), from a time domain into a spectral domain, for example, a time-frequency domain, to obtain a first S1(k, n) and a second S2(k, n) spectral-domain audio signal from the beam generator.
In S1(k, n) and S2(k, n), k specifies the frequency index and n specifies the time index of the respective audio signal from the beam generator. The analysis filter bank can be any type of analysis filter bank, such as a Short-Time Fourier Transform (STFT) filter bank, polyphase filter banks, Quadrature Mirror Filter (QMF) filter banks, but also filter banks such as the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT) and the analysis filter bank of the Modified Discrete Cosine Transform (MDCT). To obtain the first and second spectral-domain audio signals from the beam generator, S1 and S2, the characteristics of the audio signals S1 and S2 can be analyzed for each time frame and for each of several frequency bands.
In addition, the signal generator comprises an intersection calculator 620 to generate an audio output signal in the spectral domain.
In addition, the signal generator comprises a synthesis filter bank 630 for transforming the generated audio output signal from a spectral domain into a time domain. The synthesis filter bank 630 may, for example, comprise a Short-Time Fourier Transform (STFT) synthesis filter bank, a polyphase synthesis filter bank, Quadrature Mirror Filter (QMF) synthesis filter banks, but also a synthesis filter bank such as the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT) or the Modified Discrete Cosine Transform (MDCT) synthesis filter bank.
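The analysis/synthesis filter bank pair around the intersection calculator can be sketched, for example, with an STFT. The use of `scipy.signal.stft`/`istft` below is merely one possible illustration, not the implementation of the application:

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)                # 1-second test tone standing in for s1(t)

# Analysis filter bank: time domain -> time/frequency tiles S1(k, n)
f, frames, S1 = stft(x, fs=fs, nperseg=512)    # S1.shape == (frequency bins, time frames)

# ... spectral-domain processing of S1(k, n) would take place here ...

# Synthesis filter bank: time/frequency tiles -> time domain
_, x_rec = istft(S1, fs=fs, nperseg=512)
recon_error = float(np.max(np.abs(x - x_rec[:len(x)])))
```

With matching analysis and synthesis parameters, the round trip is (numerically) perfectly reconstructing, so any difference between output and input stems only from the spectral-domain processing inserted in between.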
In the following, possible ways of calculating the audio output signal, for example, by extracting a coherence measure, are explained. The intersection calculator 620 of Figure 6 can be adapted to calculate the audio output signal in the spectral domain according to one or more of these approaches.
Coherence, as extracted here, is a measure of the common coherent content while compensating for scaling and phase-shift operations. See, for example: [Fa06] C. Faller, "Parametric Multichannel Audio Coding: Synthesis of Coherence Cues," IEEE Trans. on Speech and Audio Proc., Vol. 14, no. 1, Jan 2006; [Her08] J. Herre, K. Kjörling, J. Breebaart, C. Faller, S. Disch, H. Purnhagen, J. Koppens, J. Hilpert, J. Rödén, W. Oomen, K. Linzmeier, K. S. Chong: "MPEG Surround - The ISO/MPEG Standard for Efficient and Compatible Multichannel Audio Coding", Journal of the AES, Vol. 56, No. 11, November 2008, pp. 932-955.
One possibility for generating an estimate of the coherent part of the signal from the first and second audio signals of the beam generator is to apply cross factors to one of the two signals. The cross factors can be averaged over time. Here, the relative delay between the first and the second audio signal of the beam generator is assumed to be limited, so that it is substantially smaller than the size of the filter bank window.
In the following, applications for calculating the audio output signal in the spectral domain by extracting the common signal part, employing a correlation-based approach with an explicit calculation of a coherence measure, are explained in detail.
The signals S1(k, n) and S2(k, n) denote the spectral-domain representations of the audio signals from the beam generator, where k is a frequency index and n is a time index. For each particular time-frequency tile (k, n), specified by a particular frequency index k and a particular time index n, a coefficient exists for each of the signals S1(k, n) and S2(k, n). From the two spectral-domain audio signals S1(k, n), S2(k, n), the energy of the intersection component is calculated. This energy of the intersection component can be calculated, for example, by determining the magnitude of the cross-spectral density (CSD) C12(k, n) of S1(k, n) and S2(k, n): C12(k, n) = |E{S1(k, n) · S2*(k, n)}|
Here, the superscript * denotes the complex conjugate and E{·} represents the mathematical expectation. In practice, the expectation operator is replaced, for example, by temporal or frequency smoothing of the term S1(k, n) · S2*(k, n), depending on the time/frequency resolution of the filter bank employed.
The power spectral density (PSD) P1(k, n) of the first audio signal from the beam generator S1(k, n) and the power spectral density P2(k, n) of the second audio signal from the beam generator S2(k, n) can be calculated according to the formulas: P1(k, n) = E{|S1(k, n)|²}, P2(k, n) = E{|S2(k, n)|²}.
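A sketch (not part of the original disclosure) of how C12(k, n), P1(k, n) and P2(k, n) might be estimated with recursive temporal smoothing in place of the expectation operator; the smoothing constant and the synthetic test spectra are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 257, 100                                 # frequency bins, time frames
S = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))
S1 = S                                          # first pickup: the common signal
S2 = 0.9 * np.exp(1j * 0.3) * S                 # second pickup: scaled, phase-shifted copy

alpha = 0.8                                     # recursive smoothing replaces E{.}
c12 = np.zeros(K, complex)
P1 = np.zeros(K)
P2 = np.zeros(K)
for n in range(N):
    c12 = alpha * c12 + (1 - alpha) * S1[:, n] * np.conj(S2[:, n])
    P1 = alpha * P1 + (1 - alpha) * np.abs(S1[:, n]) ** 2
    P2 = alpha * P2 + (1 - alpha) * np.abs(S2[:, n]) ** 2

C12 = np.abs(c12)                               # C12(k, n) = |E{S1(k, n) S2*(k, n)}|
coherence = C12 / np.sqrt(P1 * P2 + 1e-12)      # approaches 1 for fully coherent signals
```

Since the second spectrum here is only a scaled, phase-shifted copy of the first, the normalized coherence is close to one in every band, consistent with the compensation property noted above.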
Next, applications for practical implementations of the computation of the acoustic intersection Y(k, n) of the two audio signals from the beam generator are presented.
A first way to obtain an output signal is to modify the first audio signal from the beam generator S1(k, n): Y1(k, n) = G1(k, n) · S1(k, n), with the gain function G1(k, n) = C12(k, n) / P1(k, n).
Similarly, an alternative output signal can be derived from the second audio signal from the beam generator S2(k, n): Y2(k, n) = G2(k, n) · S2(k, n), with G2(k, n) = C12(k, n) / P2(k, n).
When determining the output signal, it may be useful to limit the maximum value of the gain functions G1(k, n) and G2(k, n) to a certain limit value, for example, to one. Figure 7 is a flowchart illustrating the generation of an audio output signal based on a cross-spectral density and a power spectral density according to an application.
In step 710, a cross-spectral density C12(k, n) of the first and second audio signals from the beam generator is calculated. For example, the formula described above, C12(k, n) = |E{S1(k, n) · S2*(k, n)}|, can be applied.
In step 720, the power spectral density P1(k, n) of the first audio signal from the beam generator is calculated. Alternatively, the power spectral density of the second audio signal from the beam generator can also be used.
Subsequently, in step 730, a gain function G1(k, n) is calculated based on the cross-spectral density calculated in step 710 and the power spectral density calculated in step 720.
Finally, in step 740, the first audio signal from the beam generator S1(k, n) is modified to obtain the desired audio output signal Y1(k, n). If the power spectral density of the second audio signal from the beam generator was calculated in step 720, then the second audio signal from the beam generator S2(k, n) can be modified to obtain the desired audio output signal. Since both implementations have a single energy term in the denominator, which can become small depending on the location of the active sound source in relation to the two beams, it can be preferable to use a gain that represents the ratio of the sound energy corresponding to the acoustic intersection to the whole or half of the sound energy collected by the beam generators. An output signal can then be obtained by applying
the formula: Y3(k, n) = G3(k, n) · S1(k, n), with G3(k, n) = C12(k, n) / (0.5 · (P1(k, n) + P2(k, n))), and, correspondingly, Y4(k, n) = G3(k, n) · S2(k, n).
In both examples described above, the gain functions take small values in the event that the sound recorded in the audio signals from the beam generator does not comprise components of the acoustic intersection signal. On the other hand, gain values close to one are obtained if the audio signals from the beam generator correspond to the desired acoustic intersection.
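The behavior of such gain functions can be illustrated (outside the original disclosure) on two toy time-frequency tiles, one coherent between both signals and one present in only one signal. The gain definitions G1 = C12/P1 and G3 = C12/(0.5·(P1 + P2)), as well as the limiting to one, follow the description above, but the helper name and test values are hypothetical:

```python
import numpy as np

def intersection_gains(S1, S2, C12, P1, P2, g_max=1.0):
    """Gains built from the CSD and PSDs, limited to g_max as suggested above."""
    eps = 1e-12
    G1 = np.minimum(C12 / (P1 + eps), g_max)                 # single-PSD variant
    G3 = np.minimum(C12 / (0.5 * (P1 + P2) + eps), g_max)    # mean-PSD variant
    return G1 * S1, G3 * S1                                  # Y1(k, n), Y3(k, n)

# Two toy tiles: index 0 coherent in both signals, index 1 present only in S1.
S1 = np.array([1.0 + 0j, 1.0 + 0j])
S2 = np.array([1.0 + 0j, 0.0 + 0j])
C12 = np.abs(S1 * np.conj(S2))                  # stands in for |E{S1 S2*}|
P1, P2 = np.abs(S1) ** 2, np.abs(S2) ** 2
Y1, Y3 = intersection_gains(S1, S2, C12, P1, P2)
# Y1 == [1, 0]: the coherent tile passes, the single-signal tile is suppressed
```

This reproduces the behavior stated in the text: gains near one for tiles belonging to the acoustic intersection and near zero otherwise.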
In addition, to ensure that only components that correspond to the acoustic intersection (rather than to the limited directivity of the beam generators used) appear in the audio output signal, it may be advisable to calculate the final output signal as the smaller signal (by energy) of Y1 and Y2 (or of Y3 and Y4, respectively). In one application, that signal Y1 or Y2 of the two signals Y1, Y2 which has the lower mean energy is considered the smaller signal. In another application, that signal Y3 or Y4 of the two signals Y3, Y4 which has the lower mean energy is considered the smaller signal.
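A minimal sketch of the minimum-energy selection (hypothetical helper name and toy values, not part of the disclosure):

```python
import numpy as np

def pick_smaller_energy(Ya, Yb):
    """Return whichever spectral-domain signal has the lower mean energy."""
    return Ya if np.mean(np.abs(Ya) ** 2) <= np.mean(np.abs(Yb) ** 2) else Yb

Y1 = np.array([1.0 + 0j, 0.5 + 0j, 0.2 + 0j])   # variant derived from S1
Y2 = np.array([0.1 + 0j, 0.1 + 0j, 0.1 + 0j])   # variant derived from S2
Y_final = pick_smaller_energy(Y1, Y2)            # Y2 is kept: it has lower mean energy
```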
Still other ways of calculating the audio output signal exist that, different from what was described in relation to the previous applications, make use of both the first and the second audio signal from the beam generator, S1 and S2 (as opposed to using only their powers), by combining them into a single signal that is subsequently weighted using one of the described gain functions. For example, the first and second audio signals from the beam generator, S1 and S2, can be added and the resulting sum can subsequently be weighted using one of the gain functions described above.
The spectral-domain audio output signal S can be converted from a time/frequency representation into a time signal using an (inverse) synthesis filter bank.
In another application, the common part of the two signals is extracted by processing the magnitude spectra of a combined signal (for example, a sum signal), for example, so that it has the intersection (for example, the minimum) PSD (power spectral density) of both (normalized) beam generator signals. The input signals can be analyzed in a time/frequency-selective way, as previously described, and the idealized assumption is made that the two noise signals are sparse and disjoint, that is, they do not appear in the same time/frequency tile. In this case, a simple solution would be to limit the value of the power spectral density (PSD) of one of the signals to the value of the other signal after an adequate renormalization/alignment procedure. It can be assumed that the relative delay between the two signals is limited so that it is substantially smaller than the size of the filter bank window.
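The magnitude-limiting idea can be sketched per time-frequency tile as follows (not part of the original disclosure); keeping the phase of the first signal is an illustrative assumption, since the text leaves the phase handling open:

```python
import numpy as np

def limit_magnitude(S1, S2):
    """Per tile, limit |S1| to |S2|; keeps the phase of S1 (illustrative choice)."""
    mag = np.minimum(np.abs(S1), np.abs(S2))     # minimum (intersection) magnitude
    return mag * np.exp(1j * np.angle(S1))

S1 = np.array([2.0 + 0j, 0.5j])                  # strong tile 0, weak tile 1
S2 = np.array([1.0 + 0j, 3.0 + 0j])
Y = limit_magnitude(S1, S2)
# Y == [1+0j, 0.5j]: each tile is capped at the smaller of the two magnitudes
```

Under the sparse/disjoint noise assumption stated above, a noise component present in only one signal is capped by the (small) magnitude of the other signal and thus largely removed.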
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a function of a method step. Similarly, the aspects described in the context of a method step also represent a description of a corresponding block or item or function of a corresponding device.
A signal generated in accordance with the applications described above can be stored on a digital storage medium or can be transmitted on a transmission medium, such as a wireless transmission medium or a wired transmission medium, for example, the Internet.
Depending on certain implementation requirements, applications of the invention can be implemented in hardware or in software. The implementation can be carried out using a digital storage medium, for example, a floppy disk, a DVD, a CD, a ROM memory, a PROM, an EPROM, an EEPROM or a FLASH, having readable control signals electronically stored on it, which cooperate (or can cooperate) with a programmable computer system so that the respective method is carried out.
Some applications according to the invention comprise a non-transitory data carrier having electronically readable control signals, which can cooperate with a programmable computer system, so that one of the methods described here is carried out.
Generally, the applications of the present invention can be implemented as a computer program product with a program code, the program code being operative to perform one of the methods when the computer program product operates on a computer. The program code can, for example, be stored on a machine-readable conveyor.
Other applications include the computer program to perform one of the methods described here, stored on a machine-readable conveyor.
In other words, an application of the inventive method is, in this way, a computer program having a program code to perform one of the methods described here, when the computer program operates on a computer.
Another application of the inventive methods is, in this way, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded on it, the computer program for carrying out one of the methods described here.
Another application of the inventive method is, in this way, a data stream or a sequence of signals that represents the computer program to perform one of the methods described here. The data stream or signal sequence can, for example, be configured to be transferred over a data communication connection, for example, over the Internet.
Another application comprises a processing medium, for example a computer, or a programmable logic device, configured or adapted to perform one of the methods described here.
Another application comprises the computer having the computer program installed in it to perform one of the methods described here.
In some applications, a programmable logic device (for example, an array of programmable logic gates) can be used to perform some or all of the functionality of the methods described here. In some applications, an array of programmable logic gates can cooperate with a microprocessor to perform one of the methods described here. Generally, the methods are preferably performed by any hardware device.
The applications described above are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described here will be evident to others skilled in the art. It is the intention, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the applications here.

References:
[BS01] J. Bitzer, K. U. Simmer: "Superdirective microphone arrays" in M. Brandstein, D. Ward (eds.): "Microphone Arrays - Signal Processing Techniques and Applications", Chapter 2, Springer Berlin, 2001, ISBN: 978-3-540-41953-2
[BW01] M. Brandstein, D. Ward: "Microphone Arrays - Signal Processing Techniques and Applications", Springer Berlin, 2001, ISBN: 978-3-540-41953-2
[CBH06] J. Chen, J. Benesty, Y. Huang: "Time Delay Estimation in Room Acoustic Environments: An Overview", EURASIP Journal on Applied Signal Processing, Article ID 26503, Volume 2006 (2006)
[Pul06] Pulkki, V.: "Directional audio coding in spatial sound reproduction and stereo upmixing," in Proceedings of the AES 28th International Conference, pp. 251-258, Piteå, Sweden, June 30 - July 2, 2006.
[DIFF2009] M. Kallinger, G. Del Galdo, F. Küch, D. Mahne, and R. Schultz-Amling: "Spatial Filtering using Directional Audio Coding Parameters," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Apr. 2009.
[Ea01] Eargle, J.: "The Microphone Book", Focal Press, 2001.
[Elk00] G. W. Elko: "Superdirectional microphone arrays" in S. G. Gay, J. Benesty (eds.): "Acoustic Signal Processing for Telecommunication", Chapter 10, Kluwer Academic Press, 2000, ISBN: 978-0792378143
[Fa03] C. Faller and F. Baumgarte: "Binaural Cue Coding - Part II: Schemes and applications," IEEE Trans. on Speech and Audio Proc., Vol. 11, no. 6, Nov. 2003
[Fa06] C. Faller: "Parametric Multichannel Audio Coding: Synthesis of Coherence Cues," IEEE Trans. on Speech and Audio Proc., Vol. 14, no. 1, Jan 2006
[Fal08] C. Faller: "Obtaining a Highly Directive Center Channel from Coincident Stereo Microphone Signals", Proc. 124th AES Convention, Amsterdam, The Netherlands, 2008, Preprint 7380.
[Her08] J. Herre, K. Kjörling, J. Breebaart, C. Faller, S. Disch, H. Purnhagen, J. Koppens, J. Hilpert, J. Rödén, W. Oomen, K. Linzmeier, K. S. Chong: "MPEG Surround - The ISO/MPEG Standard for Efficient and Compatible Multichannel Audio Coding", Journal of the AES, Vol. 56, No. 11, November 2008, pp. 932-955
[SBM01] K. U. Simmer, J. Bitzer, and C. Marro: "Post-Filtering Techniques" in M. Brandstein, D. Ward (eds.): "Microphone Arrays - Signal Processing Techniques and Applications", Chapter 3, Springer Berlin, 2001, ISBN: 978-3-540-41953-2
[Veen88] B. D. V. Veen and K. M. Buckley: "Beamforming: A versatile approach to spatial filtering", IEEE ASSP Magazine, pages 4-24, Apr. 1988.
[Vil06] L. Villemoes, J. Herre, J. Breebaart, G. Hotho, S. Disch, H. Purnhagen, and K. Kjörling: "MPEG Surround: The Forthcoming ISO Standard for Spatial Audio Coding," in AES 28th International Conference, Piteå, Sweden, June 2006.
Claims:
Claims (12)
[0001]
1. An apparatus for capturing audio information from a destination location, comprising: a first beam generator (110; 210; 410) being arranged in a recording environment and having a first recording characteristic, a second beam generator (120; 220; 420) being arranged in the recording environment and having a second recording characteristic, and a signal generator (130; 230), characterized in that the first beam generator (110; 210; 410) is configured to record a first audio signal from the beam generator when the first beam generator (110; 210; 410) is directed to the destination location with respect to the first recording characteristic, and in that the second beam generator (120; 220; 420) is configured to record a second audio signal from the beam generator when the second beam generator (120; 220; 420) is directed to the destination location with respect to the second recording characteristic, wherein the first beam generator (110; 210; 410) and the second beam generator (120; 220; 420) are arranged so that a first virtual straight line, being defined to pass through the first beam generator (110; 210; 410) and the destination location, and a second virtual straight line, being defined to pass through the second beam generator (120; 220; 420) and the destination location, are not parallel to each other, and wherein the signal generator (130; 230) is configured to generate an audio output signal based on the first audio signal from the beam generator and the second audio signal from the beam generator, so that the audio output signal comprises relatively more audio information from the destination location compared to the audio information from the destination location in the first and second audio signals from the beam generator, wherein the signal generator (130; 230) comprises an intersection calculator (620) for generating the audio output signal in the spectral domain based on the first and second audio signals from the beam generator, and wherein the intersection calculator (620) is configured to compute the audio output signal in the spectral domain by computing a cross-spectral density of the first and second audio signals from the beam generator and calculating the power spectral density of the first and second audio signals from the beam generator.
[0002]
An apparatus according to claim 1, characterized in that the first virtual straight line and the second virtual straight line are arranged so that they intersect at the destination location with an angle of intersection so that the angle of intersection is between 30 degrees and 150 degrees.
[0003]
An apparatus according to claim 2, characterized in that the first virtual straight line and the second virtual straight line are arranged so that they intersect at the destination location so that the angle of intersection is approximately 90 degrees.
[0004]
An apparatus according to claim 1, characterized in that the signal generator (130; 230) comprises an adaptive filter (510) having a plurality of filter coefficients, wherein the adaptive filter (510) is arranged to receive the first audio signal from the beam generator, wherein the adaptive filter (510) is adapted to modify the first audio signal from the beam generator depending on the filter coefficients to obtain a first audio signal from the filtered beam generator as a signal audio output, and where the signal generator (130; 230) is configured to adjust the filter coefficients of the adaptive filter (510) depending on the first audio signal from the filtered beam generator and the second audio signal from the generator beam.
[0005]
An apparatus according to claim 4, characterized in that the signal generator (130; 230) is configured to adjust the filter coefficients so that the difference between the first filtered audio signal and the second audio signal of the beam generator is reduced.
[0006]
An apparatus according to claim 1, characterized in that the signal generator (130; 230) further comprises: an analysis filter bank (610) for transforming the first and second audio signals from the beam generator from a time domain into a spectral domain, and a synthesis filter bank (630) for transforming the audio output signal from a spectral domain into a time domain, wherein the intersection calculator (620) is configured to calculate the audio output signal in the spectral domain based on the first audio signal from the beam generator being represented in the spectral domain and the second audio signal from the beam generator being represented in the spectral domain, and wherein the calculation is performed separately in several frequency bands.
[0007]
An apparatus according to claim 1, characterized in that the intersection calculator (620) is configured to compute the audio output signal in the spectral domain using the formula.
[0008]
An apparatus according to claim 1, characterized in that an intersection calculator (620) is configured to compute the audio output signal in the spectral domain using the formula
[0009]
An apparatus according to claim 7, characterized in that the intersection calculator (620) is adapted to compute a first intermediate signal according to the formula
[0010]
An apparatus according to claim 1, characterized in that the signal generator (130; 230) is adapted to generate the audio output signal by combining the first and the second audio signal of the beam generator to obtain a combined signal and weighting the combined signal by a gain factor.
[0011]
An apparatus according to claim 1, characterized in that the signal generator (130; 230) is adapted to generate the audio output signal by generating a combined signal so that the power spectral density value of the combined signal is equal to the minimum of the power spectral density values of the first and second audio signals from the beam generator for each time-frequency tile considered.
[0012]
12. A method for computing the audio information of a destination location, comprising: recording a first audio signal from the beam generator by a first beam generator being arranged in a recording environment and having a first recording characteristic when the first beam generator is directed to the destination location with respect to the first recording characteristic, recording a second audio signal from the beam generator by a second beam generator being arranged in the recording environment and having a second recording characteristic when the second beam generator is directed to the destination location with respect to the second recording characteristic, and generating an audio output signal based on the first audio signal from the beam generator and the second audio signal from the beam generator so that the audio output signal reflects relatively more audio information from the destination location compared to the audio information of the destination location in the first and second audio signals from the beam generator, characterized in that the first beam generator and the second beam generator are arranged such that a first virtual straight line, being defined to pass through the first beam generator and the destination location, and a second virtual straight line, being defined to pass through the second beam generator and the destination location, are not parallel to each other, wherein the audio output signal is generated in the spectral domain based on the first and second audio signals from the beam generator, and wherein the audio output signal is calculated in the spectral domain by calculating a cross-spectral density of the first and second audio signals from the beam generator and calculating a power spectral density of the first and second audio signals from the beam generator.
Similar techniques:
Publication number | Publication date | Patent title
BR112013013673B1|2021-03-30|APPARATUS AND METHOD FOR THE ACQUISITION OF SPATIALLY SELECTIVE SOUND BY ACOUSTIC TRIANGULATION
ES2525839T3|2014-12-30|Acquisition of sound by extracting geometric information from arrival direction estimates
CA2857611C|2017-04-25|Apparatus and method for microphone positioning based on a spatial power density
BR112014013336B1|2021-08-24|APPARATUS AND METHOD FOR COMBINING SPATIAL AUDIO CODING FLOWS BASED ON GEOMETRY
JP5449624B2|2014-03-19|Apparatus and method for resolving ambiguity from direction of arrival estimates
BR112014013335B1|2021-11-23|APPARATUS AND METHOD FOR MICROPHONE POSITIONING BASED ON A SPACE POWER DENSITY
Patent family:
Publication number | Publication date
JP2014502108A|2014-01-23|
WO2012072787A1|2012-06-07|
KR20130116299A|2013-10-23|
MX2013006069A|2013-10-30|
TW201234872A|2012-08-16|
EP2647221B1|2020-01-08|
BR112013013673A2|2017-09-26|
ES2779198T3|2020-08-14|
AU2011334840B2|2015-09-03|
CA2819393C|2017-04-18|
US20130258813A1|2013-10-03|
US9143856B2|2015-09-22|
AR084090A1|2013-04-17|
CN103339961A|2013-10-02|
CA2819393A1|2012-06-07|
AU2011334840A1|2013-07-04|
EP2647221A1|2013-10-09|
KR101555416B1|2015-09-23|
RU2013130227A|2015-01-10|
RU2559520C2|2015-08-10|
CN103339961B|2017-03-29|
TWI457011B|2014-10-11|
Cited documents:
Publication number | Application date | Publication date | Applicant | Patent title

JPH1124690A|1997-07-01|1999-01-29|Sanyo Electric Co Ltd|Speaker voice extractor|
JP3548706B2|2000-01-18|2004-07-28|日本電信電話株式会社|Zone-specific sound pickup device|
US8098844B2|2002-02-05|2012-01-17|Mh Acoustics, Llc|Dual-microphone spatial noise suppression|
WO2004059643A1|2002-12-28|2004-07-15|Samsung Electronics Co., Ltd.|Method and apparatus for mixing audio stream and information storage medium|
JP4247037B2|2003-01-29|2009-04-02|株式会社東芝|Audio signal processing method, apparatus and program|
DE10333395A1|2003-07-16|2005-02-17|Alfred Kärcher Gmbh & Co. Kg|Floor Cleaning System|
WO2006006935A1|2004-07-08|2006-01-19|Agency For Science, Technology And Research|Capturing sound from a target region|
US20070047742A1|2005-08-26|2007-03-01|Step Communications Corporation, A Nevada Corporation|Method and system for enhancing regional sensitivity noise discrimination|
WO2009049645A1|2007-10-16|2009-04-23|Phonak Ag|Method and system for wireless hearing assistance|
JP5032960B2|2007-11-28|2012-09-26|パナソニック株式会社|Acoustic input device|
EP2146519B1|2008-07-16|2012-06-06|Nuance Communications, Inc.|Beamforming pre-processing for speaker localization|
ES2425814T3|2008-08-13|2013-10-17|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus for determining a converted spatial audio signal|
WO2010028784A1|2008-09-11|2010-03-18|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus, method and computer program for providing a set of spatial cues on the basis of a microphone signal and apparatus for providing a two-channel audio signal and a set of spatial cues|
BR112013013673B1|2010-12-03|2021-03-30|Fraunhofer-Gesellschaft Zur Eorderung Der Angewandten Forschung E.V|APPARATUS AND METHOD FOR THE ACQUISITION OF SPATIALLY SELECTIVE SOUND BY ACOUSTIC TRIANGULATION|BR112013013673B1|2010-12-03|2021-03-30|Fraunhofer-Gesellschaft Zur Eorderung Der Angewandten Forschung E.V|APPARATUS AND METHOD FOR THE ACQUISITION OF SPATIALLY SELECTIVE SOUND BY ACOUSTIC TRIANGULATION|
WO2014167165A1|2013-04-08|2014-10-16|Nokia Corporation|Audio apparatus|
JP6106571B2|2013-10-16|2017-04-05|日本電信電話株式会社|Sound source position estimating apparatus, method and program|
CN104715753B|2013-12-12|2018-08-31|联想有限公司|A kind of method and electronic equipment of data processing|
US9961456B2|2014-06-23|2018-05-01|Gn Hearing A/S|Omni-directional perception in a binaural hearing aid system|
US9326060B2|2014-08-04|2016-04-26|Apple Inc.|Beamforming in varying sound pressure level|
DE102015203600B4|2014-08-22|2021-10-21|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|FIR filter coefficient calculation for beamforming filters|
WO2016114988A2|2015-01-12|2016-07-21|Mh Acoustics, Llc|Reverberation suppression using multiple beamformers|
RU2630161C1|2016-02-18|2017-09-05|Закрытое акционерное общество "Современные беспроводные технологии"|Sidelobe suppressing device for pulse compression of multiphase codes p3 and p4 |
JP6260666B1|2016-09-30|2018-01-17|沖電気工業株式会社|Sound collecting apparatus, program and method|
JP2018170617A|2017-03-29|2018-11-01|沖電気工業株式会社|Sound pickup device, program, and method|
US10789949B2|2017-06-20|2020-09-29|Bose Corporation|Audio device with wakeup word detection|
JP2019021966A|2017-07-11|2019-02-07|オリンパス株式会社|Sound collecting device and sound collecting method|
CN108109617B|2018-01-08|2020-12-15|深圳市声菲特科技技术有限公司|Remote pickup method|
JPWO2020066542A1|2018-09-26|2021-09-16|パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America|Acoustic object extraction device and acoustic object extraction method|
US10832695B2|2019-02-14|2020-11-10|Microsoft Technology Licensing, Llc|Mobile audio beamforming using sensor fusion|
DE102019205205B3|2019-04-11|2020-09-03|BSH Hausgeräte GmbH|Interaction device|
US10735887B1|2019-09-19|2020-08-04|Wave Sciences, LLC|Spatial audio array processing system and method|
WO2021226503A1|2020-05-08|2021-11-11|Nuance Communications, Inc.|System and method for data augmentation for multi-microphone signal processing|
Legal status:
2018-12-18| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2019-10-01| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-10-20| B25A| Requested transfer of rights approved|Owner name: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V. (DE) |
2021-01-26| B09A| Decision: intention to grant|
2021-03-30| B16A| Patent or certificate of addition of invention granted|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 02/12/2011, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Application date | Patent title
US41972010P|2010-12-03|2010-12-03|
US61/419,720|2010-12-03|
PCT/EP2011/071600|WO2012072787A1|2010-12-03|2011-12-02|Apparatus and method for spatially selective sound acquisition by acoustic triangulation|